Data Source

Source: US Environmental Protection Agency EJSCREEN Tool, 2020 data (last modified 7/1/21)

About the Data

EJSCREEN is an “environmental justice (EJ) mapping and screening tool” produced by the EPA.

Variable Descriptions

glimpse(ejscreen)
## Rows: 155
## Columns: 36
## $ ID         <dbl> 510030101001, 510030101002, 510030101003, 510030102011, 510…
## $ PRE1960PCT <dbl> 0.071991001, 0.298299845, 0.256756757, 0.027385892, 0.08238…
## $ DSLPM      <dbl> 0.1316935, 0.1316935, 0.1316935, 0.1974282, 0.1974282, 0.21…
## $ CANCER     <dbl> 24.95765, 24.95765, 24.95765, 28.33822, 28.33822, 28.15892,…
## $ RESP       <dbl> 0.3120049, 0.3120049, 0.3120049, 0.3656255, 0.3656255, 0.35…
## $ PTRAF      <dbl> 0.00000000, 0.00000000, 0.31287689, 286.71433099, 4.0751412…
## $ PWDIS      <dbl> 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.0…
## $ PNPL       <dbl> 0.03555328, 0.05853532, 0.06053500, 0.02780625, 0.03162721,…
## $ PRMP       <dbl> 0.05435428, 0.09491017, 0.14614959, 0.05496790, 0.04548612,…
## $ PTSDF      <dbl> 0.10236159, 0.06039390, 0.08200253, 0.21222923, 0.23416369,…
## $ OZONE      <dbl> 41.64513, 41.64513, 41.64513, 41.65850, 41.65850, 41.65348,…
## $ PM25       <dbl> 7.241029, 7.241029, 7.241029, 7.386133, 7.386133, 7.364257,…
## $ P_LDPNT    <dbl> 33.48404, 62.76819, 58.92123, 21.17666, 35.58471, 51.33492,…
## $ P_DSLPM    <dbl> 7.637395, 7.637395, 7.637395, 18.744228, 18.744228, 21.5940…
## $ P_CANCR    <dbl> 22.04262, 22.04262, 22.04262, 35.76293, 35.76293, 34.95291,…
## $ P_RESP     <dbl> 19.50741, 19.50741, 19.50741, 32.45459, 32.45459, 30.91068,…
## $ P_PTRAF    <dbl> 0.000000, 0.000000, 5.341820, 56.287080, 8.495607, 7.452488…
## $ P_PWDIS    <dbl> 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.000…
## $ P_PNPL     <dbl> 31.93691, 48.19756, 49.36919, 25.50034, 28.75482, 36.94343,…
## $ P_PRMP     <dbl> 5.555356, 14.571876, 26.639997, 5.670226, 3.932652, 8.65649…
## $ P_PTSDF    <dbl> 14.103431, 7.377108, 10.976983, 26.610963, 28.217063, 21.23…
## $ P_OZONE    <dbl> 40.64001, 40.64001, 40.64001, 40.75917, 40.75917, 40.71748,…
## $ P_PM25     <dbl> 16.10605, 16.10605, 16.10605, 18.16651, 18.16651, 17.83501,…
## $ T_LDPNT    <chr> "0.072 = fraction pre-1960 (33%ile)", "0.3 = fraction pre-1…
## $ T_DSLPM    <chr> "0.132 ug/m3 (7%ile)", "0.132 ug/m3 (7%ile)", "0.132 ug/m3 …
## $ T_CANCR    <chr> "25 lifetime risk per million (22%ile)", "25 lifetime risk …
## $ T_RESP     <chr> "0.31  (19%ile)", "0.31  (19%ile)", "0.31  (19%ile)", "0.37…
## $ T_PTRAF    <chr> NA, NA, "0.31 daily vehicles/meters distance (5%ile)", "290…
## $ T_PWDIS    <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "0.0000…
## $ T_PNPL     <chr> "0.036 sites/km distance (31%ile)", "0.059 sites/km distanc…
## $ T_PRMP     <chr> "0.054 facilities/km distance (5%ile)", "0.095 facilities/k…
## $ T_PTSDF    <chr> "0.1 facilities/km distance (14%ile)", "0.06 facilities/km …
## $ T_OZONE    <chr> "41.6 ppb (40%ile)", "41.6 ppb (40%ile)", "41.6 ppb (40%ile…
## $ T_PM25     <chr> "7.24 ug/m3 (16%ile)", "7.24 ug/m3 (16%ile)", "7.24 ug/m3 (…
## $ AREALAND   <dbl> 74984742, 227406673, 60429005, 39270835, 30830002, 28336114…
## $ AREAWATER  <dbl> 463193, 501595, 852995, 367647, 395358, 844351, 1021277, 50…

Observations are block group estimates of key environmental indicators:

  • Lead paint (PRE1960PCT)
  • Particulate matter levels in the air (DSLPM and PM25)
  • Air toxics cancer risk (CANCER)
  • Air toxics respiratory hazard index (RESP)
  • Traffic proximity (PTRAF)
  • Proximity to National Priorities List sites (PNPL)
  • Proximity to Risk Management Plan facilities (PRMP)
  • Proximity to Treatment Storage and Disposal facilities (PTSDF)
  • Ozone level in the air (OZONE)
  • Major direct dischargers to water (PWDIS)

Columns prefixed with P_ hold the nationwide percentile rank of the corresponding indicator, and columns prefixed with T_ hold the text shown in the map popups.
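As a small illustration of this naming scheme, the sketch below rebuilds two rows from the glimpse() output above and groups the columns by prefix. It uses base R only; the object name `toy` and the regular expression are assumptions made for this example and are not part of the original analysis.

```r
# Toy data copied from the first two rows printed by glimpse() above
toy <- data.frame(
  OZONE   = c(41.64513, 41.64513),
  PM25    = c(7.241029, 7.241029),
  P_OZONE = c(40.64001, 40.64001),
  P_PM25  = c(16.10605, 16.10605),
  T_OZONE = c("41.6 ppb (40%ile)", "41.6 ppb (40%ile)"),
  T_PM25  = c("7.24 ug/m3 (16%ile)", "7.24 ug/m3 (16%ile)"),
  stringsAsFactors = FALSE
)

# Split the column names into percentile, popup-text, and raw-level groups
pct_cols   <- grep("^P_", names(toy), value = TRUE)
text_cols  <- grep("^T_", names(toy), value = TRUE)
level_cols <- setdiff(names(toy), c(pct_cols, text_cols))

# Recover the whole-number percentile embedded in the popup text
popup_pct <- as.numeric(sub(".*\\(([0-9]+)%ile\\)$", "\\1", toy$T_PM25))

# Compare it with the P_ column rounded down
print(data.frame(P_PM25 = toy$P_PM25, popup_pct = popup_pct,
                 matches_floor = floor(toy$P_PM25) == popup_pct))
```

In the rows printed above, the percentile embedded in the popup text matches the P_ value rounded down (for example, a P_DSLPM of 7.637395 appears as "7%ile"); whether that holds for every row of the full data set is not checked here.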

meta %>% 
  mutate(label = paste0(varname, ": ", about)) %>% 
  select(label) %>% 
  as.list()
## $label
##  [1] "ID: 12-digit FIPS block group code"                                                                             
##  [2] "PRE1960PCT: % of housing built before 1960 -- lead paint indicator"                                             
##  [3] "DSLPM: Diesel particulate matter level in the air, measured in micrograms per cubic meter"                      
##  [4] "CANCER: Cancer risk due to toxics in the air"                                                                   
##  [5] "RESP: \"Ratio of exposure concentration to health-based reference concentration\""                              
##  [6] "PTRAF: Average number of daily vehicles at major roads divided by distance in meters"                           
##  [7] "PWDIS: Toxicity-weighted stream concentrations divided by distance in kilometers"                               
##  [8] "PNPL: Number of National Priorities List (NPL) sites within 5 km divided by distance in kilometers"             
##  [9] "PRMP: Number of Risk Management Plan (RMP) facilities within 5 km divided by distance in kilometers"            
## [10] "PTSDF: Number of Treatment Storage and Disposal (TSDF) facilities within 5 km divided by distance in kilometers"
## [11] "OZONE: Summer daily average of ozone concentration in the air, in parts per billion"                            
## [12] "PM25: Yearly average PM2.5 level in the air, measured in micrograms per cubic meter"                            
## [13] "P_LDPNT: Nationwide percentile score for lead paint indicator (from 0-100)"                                     
## [14] "P_DSLPM: Nationwide percentile score for diesel particulate matter level (from 0-100)"                          
## [15] "P_CANCR: Nationwide percentile score for cancer risk (from 0-100)"                                              
## [16] "P_RESP: Nationwide percentile score for respiratory hazard index (from 0-100)"                                  
## [17] "P_PTRAF: Nationwide percentile score for proximity to traffic (from 0-100)"                                     
## [18] "P_PWDIS: Nationwide percentile score for major direct dischargers to water (from 0-100)"                        
## [19] "P_PNPL: Nationwide percentile score for proximity to NPL sites (from 0-100)"                                    
## [20] "P_PRMP: Nationwide percentile score for proximity to RMP facilities (from 0-100)"                               
## [21] "P_PTSDF: Nationwide percentile score for proximity to TSDF facilities (from 0-100)"                             
## [22] "P_OZONE: Nationwide percentile score for ozone level (from 0-100)"                                              
## [23] "P_PM25: Nationwide percentile score for PM2.5 level (from 0-100)"                                               
## [24] "T_LDPNT: Map text for lead paint indicator"                                                                     
## [25] "T_DSLPM: Map text for diesel particulate matter level"                                                          
## [26] "T_CANCR: Map text for cancer risk"                                                                              
## [27] "T_RESP: Map text for respiratory hazard index"                                                                  
## [28] "T_PTRAF: Map text for proximity to traffic"                                                                     
## [29] "T_PWDIS: Map text for major direct dischargers to water"                                                        
## [30] "T_PNPL: Map text for proximity to NPL sites"                                                                    
## [31] "T_PRMP: Map text for proximity to RMP facilities"                                                               
## [32] "T_PTSDF: Map text for proximity to TSDF facilities"                                                             
## [33] "T_OZONE: Map text for ozone level"                                                                              
## [34] "T_PM25: Map text for PM2.5 level"                                                                               
## [35] "AREALAND: Land area (in square meters)"                                                                         
## [36] "AREAWATER: Water area (in square meters)"

Summary Statistics

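ejscreen %>% select(-c(ID, T_LDPNT:T_PM25)) %>%
  # keep numeric columns that are not entirely missing; is.na(.x) returns a
  # vector, so it is collapsed with all() before being combined with &&
  select(where(~ is.numeric(.x) && !all(is.na(.x)))) %>%
  as.data.frame() %>%
  stargazer(., type = "text", title = "Summary Statistics", digits = 0,
            summary.stat = c("mean", "sd", "min", "median", "max"))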
## 
## Summary Statistics
## ===============================================================
## Statistic     Mean     St. Dev.    Min     Median       Max    
## ---------------------------------------------------------------
## PRE1960PCT     0          0         0        0           1     
## DSLPM          0          0         0        0           1     
## CANCER         30         3        23        29         34     
## RESP           0          0         0        0           0     
## PTRAF         248        378        0       67.4       2,184   
## PWDIS          0          0         0      0.000         1     
## PNPL           0          0         0        0           0     
## PRMP           0          0         0        0           1     
## PTSDF          1          1         0        0           3     
## OZONE          41         0        41        41         42     
## PM25           7          0         7        7           8     
## P_LDPNT        47         22       11        48         98     
## P_DSLPM        34         21        6        25         73     
## P_CANCR        43         13       16        40         62     
## P_RESP         39         13       13        37         60     
## P_PTRAF        34         29        0       29.1        92     
## P_PWDIS        30         26        0       41.3        95     
## P_PNPL         35         14       16        33         95     
## P_PRMP         13         14        2        7          79     
## P_PTSDF        34         21        4        32         74     
## P_OZONE        38         3        33        39         43     
## P_PM25         19         2        13        19         23     
## AREALAND   35,769,278 46,646,067 157,587 10,727,976 227,406,673
## AREAWATER   474,082   1,484,460     0     106,800   12,681,090 
## ---------------------------------------------------------------

Visual Distributions

Correlation Matrices

The following charts show the pairwise correlations among the environmental indicators. The darker the shading of a cell, the stronger the correlation between that pair of variables. The first correlation matrix shows correlations among the levels of each indicator, and the second shows correlations among the nationwide percentiles of each indicator.
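The correlation matrices in this section are computed with cor(..., use = "complete.obs"). The sketch below shows what that option does on a hypothetical toy data frame (the columns `a`, `b`, and `c` are made up for illustration and are not EJSCREEN variables): any row with a missing value is dropped before the Pearson correlations are computed, so every pair of variables is compared on the same set of rows.

```r
# Hypothetical toy indicators; the NA mimics a block group with a missing value
toy <- data.frame(
  a = c(1, 2, 3, 4, NA),
  b = c(2, 4, 6, 8, 10),
  c = c(5, 3, 4, 1, 2)
)

# use = "complete.obs" removes every row containing an NA before computing
# the correlations, so all pairs are based on rows 1 to 4 only
cor_complete <- round(cor(toy, use = "complete.obs"), digits = 2)

# The same result, obtained by removing incomplete rows by hand
cor_manual <- round(cor(na.omit(toy)), digits = 2)

print(cor_complete)
```

If none of the selected indicators has missing values, this option has no effect and all block groups contribute to each correlation.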

correlation <- ejscreen %>% 
  select(PRE1960PCT:PM25)
num_correlation <- cor(correlation, use = "complete.obs")
num_correlation <- round(num_correlation, digits = 2)
corrplot(num_correlation, type = "upper", method = "shade", 
         shade.col = NA, tl.col = "black", 
         diag = FALSE, addCoef.col = "black")

meta %>% 
  filter(varname %in% c("PRE1960PCT", "DSLPM", "CANCER", "RESP", "PTRAF", "PWDIS", "PNPL", "PRMP", "PTSDF", "OZONE", "PM25")) %>% 
  mutate(label = paste0(varname, ": ", about)) %>% 
  select(label) %>% 
  as.list()
## $label
##  [1] "PRE1960PCT: % of housing built before 1960 -- lead paint indicator"                                             
##  [2] "DSLPM: Diesel particulate matter level in the air, measured in micrograms per cubic meter"                      
##  [3] "CANCER: Cancer risk due to toxics in the air"                                                                   
##  [4] "RESP: \"Ratio of exposure concentration to health-based reference concentration\""                              
##  [5] "PTRAF: Average number of daily vehicles at major roads divided by distance in meters"                           
##  [6] "PWDIS: Toxicity-weighted stream concentrations divided by distance in kilometers"                               
##  [7] "PNPL: Number of National Priorities List (NPL) sites within 5 km divided by distance in kilometers"             
##  [8] "PRMP: Number of Risk Management Plan (RMP) facilities within 5 km divided by distance in kilometers"            
##  [9] "PTSDF: Number of Treatment Storage and Disposal (TSDF) facilities within 5 km divided by distance in kilometers"
## [10] "OZONE: Summer daily average of ozone concentration in the air, in parts per billion"                            
## [11] "PM25: Yearly average PM2.5 level in the air, measured in micrograms per cubic meter"

correlation2 <- ejscreen %>% 
  select(P_LDPNT:P_PM25)
num_correlation2 <- cor(correlation2, use = "complete.obs")
num_correlation2 <- round(num_correlation2, digits = 2)
corrplot(num_correlation2, type = "upper", method = "shade", 
         shade.col = NA, tl.col = "black", 
         diag = FALSE, addCoef.col = "black")

meta %>% 
  filter(varname %in% c("P_LDPNT", "P_DSLPM", "P_CANCR", "P_RESP", "P_PTRAF", "P_PWDIS", "P_PNPL", "P_PRMP", "P_PTSDF", "P_OZONE", "P_PM25")) %>% 
  mutate(label = paste0(varname, ": ", about)) %>% 
  select(label) %>% 
  as.list()
## $label
##  [1] "P_LDPNT: Nationwide percentile score for lead paint indicator (from 0-100)"             
##  [2] "P_DSLPM: Nationwide percentile score for diesel particulate matter level (from 0-100)"  
##  [3] "P_CANCR: Nationwide percentile score for cancer risk (from 0-100)"                      
##  [4] "P_RESP: Nationwide percentile score for respiratory hazard index (from 0-100)"          
##  [5] "P_PTRAF: Nationwide percentile score for proximity to traffic (from 0-100)"             
##  [6] "P_PWDIS: Nationwide percentile score for major direct dischargers to water (from 0-100)"
##  [7] "P_PNPL: Nationwide percentile score for proximity to NPL sites (from 0-100)"            
##  [8] "P_PRMP: Nationwide percentile score for proximity to RMP facilities (from 0-100)"       
##  [9] "P_PTSDF: Nationwide percentile score for proximity to TSDF facilities (from 0-100)"     
## [10] "P_OZONE: Nationwide percentile score for ozone level (from 0-100)"                      
## [11] "P_PM25: Nationwide percentile score for PM2.5 level (from 0-100)"

Ozone vs. PM2.5

These scatterplots show the relationship between ozone and PM2.5, broken down by county. The first scatterplot shows the correlation between the levels of ozone and PM2.5, and the second scatterplot shows the correlation between the percentiles.

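ejscreen <- ejscreen %>% 
  # ID is stored as a double; format it as a 12-digit string before slicing out the county code
  mutate(county = str_sub(sprintf("%012.0f", ID), 3, 5))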
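
The county code extracted above is positions 3 to 5 of the 12-digit block group ID, which concatenates the state (2 digits), county (3), census tract (6), and block group (1) codes. The sketch below splits the first four IDs from the glimpse() output into those parts using base R. Because the ID column is stored as a double, the example formats it with sprintf() first, on the assumption that a plain character conversion could produce scientific notation for some values; the object names are made up for illustration.

```r
# First four IDs printed by glimpse() above, stored as doubles like the ID column
id <- c(510030101001, 510030101002, 510030101003, 510030102011)

# Format as fixed 12-digit strings (no scientific notation) before slicing
id_chr <- sprintf("%012.0f", id)

fips <- data.frame(
  state       = substr(id_chr, 1, 2),    # 2-digit state code
  county      = substr(id_chr, 3, 5),    # 3-digit county code
  tract       = substr(id_chr, 6, 11),   # 6-digit census tract code
  block_group = substr(id_chr, 12, 12),  # 1-digit block group code
  stringsAsFactors = FALSE
)

print(fips)
```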

ejscreen %>%
  ggplot() +
  geom_point(aes(x=OZONE, y=PM25, color=county)) +
  labs(x="Ozone level",
       y="PM2.5 level") +
  scale_color_brewer(type = "qual", labels = c("Albemarle", "Fluvanna", "Greene", "Louisa", "Nelson", "Charlottesville"))

ejscreen %>%
  ggplot() +
  geom_point(aes(x=P_OZONE, y=P_PM25, color=county)) +
  labs(x="Ozone percentile",
       y="PM2.5 percentile") +
  scale_color_brewer(type = "qual", labels = c("Albemarle", "Fluvanna", "Greene", "Louisa", "Nelson", "Charlottesville"))

Proximity to traffic vs. air toxics cancer risk

These scatterplots show the relationship between a block group’s proximity to traffic and air toxics cancer risk, broken down by county. The first one shows the correlation between the levels, and the second one shows the correlation between the percentiles.

ejscreen %>%
  ggplot() +
  geom_point(aes(x=PTRAF, y=CANCER, color=county)) +
  labs(x="Proximity to traffic",
       y="Cancer risk") +
  scale_color_brewer(type = "qual", labels = c("Albemarle", "Fluvanna", "Greene", "Louisa", "Nelson", "Charlottesville"))

ejscreen %>%
  ggplot() +
  geom_point(aes(x=P_PTRAF, y=P_CANCR, color=county)) +
  labs(x="Traffic proximity percentile",
       y="Cancer risk percentile") +
  scale_color_brewer(type = "qual", labels = c("Albemarle", "Fluvanna", "Greene", "Louisa", "Nelson", "Charlottesville"))

Proximity to traffic vs. diesel particulate matter level

These scatterplots show the relationship between a block group’s proximity to traffic and its diesel particulate matter level, broken down by county. The first shows the correlation between the levels, and the second shows the correlation between the percentiles.

ejscreen %>%
  ggplot() +
  geom_point(aes(x=PTRAF, y=DSLPM, color=county)) +
  labs(x="Proximity to traffic",
       y="Diesel particulate matter level") +
  scale_color_brewer(type = "qual", labels = c("Albemarle", "Fluvanna", "Greene", "Louisa", "Nelson", "Charlottesville"))

ejscreen %>%
  ggplot() +
  geom_point(aes(x=P_PTRAF, y=P_DSLPM, color=county)) +
  labs(x="Traffic proximity percentile",
       y="Diesel particulate matter percentile") +
  scale_color_brewer(type = "qual", labels = c("Albemarle", "Fluvanna", "Greene", "Louisa", "Nelson", "Charlottesville"))

PM2.5 vs. diesel particulate matter level

These scatterplots show the relationship between PM2.5 and diesel particulate matter, broken down by county. The first shows the correlation between the levels, and the second shows the correlation between the percentiles.

ejscreen %>%
  ggplot() +
  geom_point(aes(x=PM25, y=DSLPM, color=county)) +
  labs(x="PM2.5 level",
       y="Diesel particulate matter level") +
  scale_color_brewer(type = "qual", labels = c("Albemarle", "Fluvanna", "Greene", "Louisa", "Nelson", "Charlottesville"))

ejscreen %>%
  ggplot() +
  geom_point(aes(x=P_PM25, y=P_DSLPM, color=county)) +
  labs(x="PM2.5 percentile",
       y="Diesel particulate matter percentile") +
  scale_color_brewer(type = "qual", labels = c("Albemarle", "Fluvanna", "Greene", "Louisa", "Nelson", "Charlottesville"))

Spatial Distributions

Proximity to treatment storage and disposal facilities (TSDFs)

pal <- colorNumeric("Blues", reverse = TRUE, domain = cvilleshapes$PTSDF)
leaflet(cvilleshapes) %>%
  addProviderTiles("CartoDB.Positron") %>%
  addPolygons(data = cvilleshapes,
              fillColor = ~pal(PTSDF),
              weight = 1,
              opacity = 1,
              color = "white",
              fillOpacity = 0.6,
              highlightOptions = highlightOptions(weight = 2, fillOpacity = 0.8, bringToFront = TRUE),
              popup = paste0("FIPS Code: ", cvilleshapes$GEOID, "<br>",
                             "Proximity to TSDF: ", cvilleshapes$T_PTSDF)) %>%
  addLegend("bottomright", pal = pal, values = cvilleshapes$PTSDF,
            title = "Proximity to TSDF", opacity = 0.7)

Proximity to traffic

pal <- colorNumeric("Blues", reverse = TRUE, domain = cvilleshapes$PTRAF)
leaflet(cvilleshapes) %>%
  addProviderTiles("CartoDB.Positron") %>%
  addPolygons(data = cvilleshapes,
              fillColor = ~pal(PTRAF),
              weight = 1,
              opacity = 1,
              color = "white",
              fillOpacity = 0.6,
              highlightOptions = highlightOptions(weight = 2, fillOpacity = 0.8, bringToFront = TRUE),
              popup = paste0("FIPS Code: ", cvilleshapes$GEOID, "<br>",
                             "Proximity to traffic: ", cvilleshapes$T_PTRAF)) %>%
  addLegend("bottomright", pal = pal, values = cvilleshapes$PTRAF,
            title = "Traffic Proximity", opacity = 0.7)

PM2.5 Percentile

pal <- colorNumeric("Blues", reverse = TRUE, domain = cvilleshapes$P_PM25)
leaflet(cvilleshapes) %>%
  addProviderTiles("CartoDB.Positron") %>%
  addPolygons(data = cvilleshapes,
              fillColor = ~pal(P_PM25),
              weight = 1,
              opacity = 1,
              color = "white",
              fillOpacity = 0.6,
              highlightOptions = highlightOptions(weight = 2, fillOpacity = 0.8, bringToFront = TRUE),
              popup = paste0("FIPS Code: ", cvilleshapes$GEOID, "<br>",
                             "PM2.5 Percentile: ", cvilleshapes$P_PM25)) %>%
  addLegend("bottomright", pal = pal, values = cvilleshapes$P_PM25,
            title = "PM2.5 Percentiles", opacity = 0.7)

Distribution of PM2.5

pal <- colorNumeric("Blues", reverse = TRUE, domain = cvilleshapes$PM25)
leaflet(cvilleshapes) %>%
  addProviderTiles("CartoDB.Positron") %>%
  addPolygons(data = cvilleshapes,
              fillColor = ~pal(PM25),
              weight = 1,
              opacity = 1,
              color = "white",
              fillOpacity = 0.6,
              highlightOptions = highlightOptions(weight = 2, fillOpacity = 0.8, bringToFront = TRUE),
              popup = paste0("FIPS Code: ", cvilleshapes$GEOID, "<br>",
                             "PM2.5 Level: ", cvilleshapes$T_PM25)) %>%
  addLegend("bottomright", pal = pal, values = cvilleshapes$PM25,
            title = "PM2.5 Concentrations", opacity = 0.7)

Air toxics cancer risk

pal <- colorNumeric("Blues", reverse = TRUE, domain = cvilleshapes$CANCER)
leaflet(cvilleshapes) %>%
  addProviderTiles("CartoDB.Positron") %>%
  addPolygons(data = cvilleshapes,
              fillColor = ~pal(CANCER),
              weight = 1,
              opacity = 1,
              color = "white",
              fillOpacity = 0.6,
              highlightOptions = highlightOptions(weight = 2, fillOpacity = 0.8, bringToFront = TRUE),
              popup = paste0("FIPS Code: ", cvilleshapes$GEOID, "<br>",
                             "Cancer Risk: ", cvilleshapes$T_CANCR)) %>%
  addLegend("bottomright", pal = pal, values = cvilleshapes$CANCER,
            title = "Air Toxics Cancer Risk", opacity = 0.7)

Diesel Particulate Matter Level

pal <- colorNumeric("Blues", reverse = TRUE, domain = cvilleshapes$DSLPM)
leaflet(cvilleshapes) %>%
  addProviderTiles("CartoDB.Positron") %>%
  addPolygons(data = cvilleshapes,
              fillColor = ~pal(DSLPM),
              weight = 1,
              opacity = 1,
              color = "white",
              fillOpacity = 0.6,
              highlightOptions = highlightOptions(weight = 2, fillOpacity = 0.8, bringToFront = TRUE),
              popup = paste0("FIPS Code: ", cvilleshapes$GEOID, "<br>",
                             "DSLPM: ", cvilleshapes$T_DSLPM)) %>%
  addLegend("bottomright", pal = pal, values = cvilleshapes$DSLPM,
            title = "Diesel Particulate Matter Level", opacity = 0.7)

Ozone Levels

pal <- colorNumeric("Blues", reverse = TRUE, domain = cvilleshapes$OZONE)
leaflet(cvilleshapes) %>%
  addProviderTiles("CartoDB.Positron") %>%
  addPolygons(data = cvilleshapes,
              fillColor = ~pal(OZONE),
              weight = 1,
              opacity = 1,
              color = "white",
              fillOpacity = 0.6,
              highlightOptions = highlightOptions(weight = 2, fillOpacity = 0.8, bringToFront = TRUE),
              popup = paste0("FIPS Code: ", cvilleshapes$GEOID, "<br>",
                             "Ozone Level: ", cvilleshapes$T_OZONE)) %>%
  addLegend("bottomright", pal = pal, values = cvilleshapes$OZONE,
            title = "Ozone Levels in the Air", opacity = 0.7)

Important Notes

PM2.5, ozone, and the National Air Toxics Assessment (NATA) indicators (cancer risk, respiratory hazard index, and diesel particulate matter) are measured at the census tract level, and the same value is assigned to every block group within a tract. All other variables are measured at the block group level.
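
One way to see this in the data is to count the distinct values of an indicator within each census tract. The sketch below does so for the first four rows shown in the glimpse() output, using base R only. The helper name `n_distinct_by_tract` is hypothetical, and the tract is taken to be the first 11 digits of the block group ID.

```r
# Values copied from the first four rows of the glimpse() output above
toy <- data.frame(
  ID    = c(510030101001, 510030101002, 510030101003, 510030102011),
  PM25  = c(7.241029, 7.241029, 7.241029, 7.386133),
  PTRAF = c(0, 0, 0.31287689, 286.71433099)
)

# The first 11 digits of the block group ID identify the census tract
toy$tract <- substr(sprintf("%012.0f", toy$ID), 1, 11)

# Hypothetical helper: number of distinct values of x within each tract
n_distinct_by_tract <- function(x, tract) {
  vapply(split(x, tract), function(v) length(unique(v)), integer(1))
}

# A count of 1 means every block group in the tract shares the same value
pm25_counts  <- n_distinct_by_tract(toy$PM25, toy$tract)
ptraf_counts <- n_distinct_by_tract(toy$PTRAF, toy$tract)

print(rbind(PM25 = pm25_counts, PTRAF = ptraf_counts))
```

In these rows, PM25 takes a single value per tract while PTRAF varies between block groups of the same tract, which is consistent with the note above. A practical consequence is that the tract-level indicators contribute fewer distinct points than there are block groups, so points can overlap in the scatterplots.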